Hybrid Techniques of X-ray Analysis to Predict Knee Osteoarthritis Grades Based on Fusion Features of CNN and Handcrafted

Knee osteoarthritis (KOA) is a chronic disease that impedes movement, especially in the elderly, affecting more than 5% of people worldwide. KOA goes through many stages, from the mild grade that can be treated to the severe grade in which the knee must be replaced. Therefore, early diagnosis of KOA is essential to avoid its development to the advanced stages. X-rays are one of the vital techniques for the early detection of knee infections, but distinguishing Kellgren-Lawrence (KL) grades requires highly experienced doctors and radiologists. Thus, artificial intelligence techniques can address the shortcomings of manual diagnosis. This study developed three methodologies for the X-ray analysis of both the Osteoarthritis Initiative (OAI) and Rani Channamma University (RCU) datasets for diagnosing KOA and discriminating between KL grades. In all methodologies, the Principal Component Analysis (PCA) algorithm was applied after the CNN models to delete the unimportant and redundant features and keep the essential features. The first methodology for analyzing X-rays and diagnosing the degree of knee inflammation uses the VGG-19-FFNN and ResNet-101-FFNN systems. The second methodology of X-ray analysis and diagnosis of KOA grade by Feed Forward Neural Network (FFNN) is based on the combined features of VGG-19 and ResNet-101 before and after PCA. The third methodology for X-ray analysis and diagnosis of KOA grade by FFNN is based on the fusion features of VGG-19 and handcrafted features, and the fusion features of ResNet-101 and handcrafted features. For the OAI dataset with fusion features of VGG-19 and handcrafted features, FFNN obtained an AUC of 99.25%, an accuracy of 99.1%, a sensitivity of 98.81%, a specificity of 100%, and a precision of 98.24%. For the RCU dataset with the fusion features of VGG-19 and the handcrafted features, FFNN obtained an AUC of 99.07%, an accuracy of 98.20%, a sensitivity of 98.16%, a specificity of 99.73%, and a precision of 98.08%.


Introduction
The human body contains many joints, the most important of which is the knee joint. The knee joint connects the thigh with the leg. KOA is one of the most common musculoskeletal system diseases, and it is a chronic disease that leads to disability in the elderly. This disease causes joint pain and progressive knee weakness, which affects more than 5% of people worldwide [1]. There is no effective treatment for KOA, especially when it is of severe grade [2]. Significant factors cause KOA, such as ageing, obesity, and accidental knee injuries [3]. Lack of an early diagnosis of KOA leads to the progression of the disease to a severe grade, in which a complete knee replacement is required. Not all patients with KOA can replace the knee because of its high cost and short life, especially for obese people. Therefore, early diagnosis of KOA is necessary to start treatments that stop KOA from progressing to its dangerous stages and to start drug and behavioral

• Combining the features of the VGG-19 and ResNet-101 models before and after PCA.
• Combining the features of VGG-19 and ResNet-101 separately with the handcrafted features, called fusion features.
The remainder of this paper is organized as follows: Section 2 discusses the methods and findings of related work to classify KOA. Section 3 describes the materials and methods used for the X-ray analysis of the two datasets of OAI and RCU of KOA. Section 4 presents the results achieved by the proposed systems for X-ray analysis of the two knee osteoarthritis datasets. Section 5 discusses the performance of the systems and compares their results. Section 6 concludes the paper.

Related Work
Bayramoglu et al. [9] provided a CNN model for KOA detection from an X-ray image dataset. The BoneFinder tool selected the patella region of interest, and features were extracted using the LBP method to describe the texture of the Region of Interest (ROI). The model reached an AUC of 81.7% and an AP of 48.7%. The model's performance improved with the ROI, achieving an AUC of 88.9% and an AP of 71.4%. Cheung et al. [10] presented several machine learning algorithms and a CNN to analyze knee joint X-rays and determine KOA's KL grade. They provided CNN maps to extract the radiological features that impact the network's decision. The CNN achieved better results than machine learning, with an AUC of 99.86%, compared with the best machine learning algorithm, which had an AUC of 41.27%. Tiulpin et al. [11] developed a deep learning model with a Res-Ne architecture for joint space width (JSW) prediction for KOA determination. The model segments the knee area to help determine the minimum JSW and achieved a fragmentation rate of 98.9%. The XGBoost classifier also achieved an AUC of 62.1% by analyzing X-rays to predict KL grade and KOA progression. Javed et al. [12] developed a pre-trained residual network to predict KL grades by analyzing radiographs. The network's performance was validated with a multicenter dataset. The network achieved an accuracy of 98% and an AUC of 98%. Teo et al. [13] presented the pre-trained InceptionV3 and DenseNet201 models to extract the features of the OAI dataset, which is split into five classes according to the severity of the osteoarthritis. The deep learning features were sent to an SVM classifier for classification. DenseNet201-SVM achieved an accuracy of 71.33%. Tri et al. [14] developed a DCNN for early classification of KOA severity based on analyzing X-rays and extracting features from them. The network showed a mean accuracy of 77.24% for each fold of each stage. Yaorong et al. [15] developed a model combining a clustering algorithm and machine learning to detect knee edges from X-rays to predict the stages of osteoarthritis development. The clustering algorithm extracts data from each X-ray. Complex data was converted to a simpler form for each image and saved to a vector. Finally, machine learning algorithms were applied to analyze the features and predict the severity of OA. Yibo et al. [16] introduced a model consisting of a spatial attention module to improve data extraction from knee X-rays and suppress unnecessary data. Then, the data was merged for all branches of attention units. The Mish activation function was used to enable model convergence and improve performance. The model reached an accuracy of 70.23% and an F1 score of 67.55%. Sophal et al. [17] created a model to select the ROI from knee X-rays and extract and classify shape features to distinguish between osteoarthritis images and their severity. The ROI was selected from the X-rays by using Otsu's method. The features were reduced by feature selection and fed to classifiers to categorize them. The model reached an AUC of 91.7%.
The researchers were able to reach satisfactory results using various methods and materials. The promising accuracy of X-ray image analysis for early detection of KOA remains the goal of every researcher. This study is distinguished from previous studies by the diversity of methods and hybrid materials applied to reach high accuracy. Because of the similarity of KOA in the early stages and the difficulty of determining the intensity of KL grading, this challenge was overcome by extracting the features from more than one deep learning model and combining and then classifying them by FFNN. Moreover, deep learning features were extracted and combined with handcrafted features and then classified by FFNN.

Materials and Methods
This section describes the methodology for X-ray analysis of the OAI and RCU datasets for discriminating KOA severity grades. The following subsections discuss each method, as shown in Figure 1.
The handcrafted features are important for assigning an image to the class it belongs to, but on their own they are limited in the accuracy they can achieve. The advantage of CNN models is their ability to extract subtle and hidden features, and this is what distinguishes them from machine learning. Thus, combining handcrafted features and CNN features will produce representative feature vectors and then achieve promising accuracy.

Description of Two Datasets
Osteoarthritis is a degenerative disease of the articular cartilage of the knee due to the lack of the soft, slippery substance that protects the bones from friction. In this study, the proposed systems were evaluated on two datasets, OAI and Rani Channamma University (RCU), to analyze X-rays, detect KOA, and determine its severity according to KL grading. The OAI dataset consists of 9786 X-rays divided into five classes according to the KL grade of knee osteoarthritis severity: 3857 X-rays for Grade 0 (Healthy), 1770 X-rays for Grade 1 (Doubtful), 2578 X-rays for Grade 2 (Minimal), 1286 X-rays for Grade 3 (Moderate), and 295 X-rays for Grade 4 (Severe) [18]. The RCU dataset consists of 1650 X-rays divided into the same five classes: 514 X-rays for Grade 0 (Healthy), 477 X-rays for Grade 1 (Doubtful), 232 X-rays for Grade 2 (Minimal), 221 X-rays for Grade 3 (Moderate), and 206 X-rays for Grade 4 (Severe) [19]. Table 1 describes the two datasets and the KOA severity according to KL grading. Figure 2 shows sample images from the OAI and RCU datasets for all KL grades of osteoarthritis.

Improving X-ray of Two Datasets of Knee OA
Factors such as the different X-ray machines, surrounding factors, light reflections, movement of the patient's knee during imaging, and other issues constitute noise in the X-rays, which leads to the deterioration of the performance of artificial intelligence systems. Thus, all these artifacts must be removed and the variance of the knee joint, medial femoral, and osteophytes increased.
In this study, the average filter and Contrast-limited Adaptive Histogram Equalization (CLAHE) method were applied to improve the X-rays of KOA.
First, all X-rays of the OAI and RCU datasets were passed to an average filter to remove noise. In each iteration, the filter selects 16 pixels: a target pixel and its 15 adjacent pixels. Then, the filter calculates the average of the 15 neighboring pixels and replaces the value of the target pixel with this average, as in Equation (1). The filter continues until it has processed each pixel in the X-ray [20].
$$f(m) = \frac{1}{p} \sum_{i=0}^{p-1} s(m - i) \qquad (1)$$
where f(m) refers to the filtered value, s(m − i) refers to the previous input values, and p refers to the number of pixels.
Second, the X-rays were passed after removing the noise to the CLAHE method to increase the visibility of the knee joint and all the bony details adjacent to the knee. The method distributes the bright pixels to the dark areas. Each time the technique compares a target pixel with neighboring pixels, the contrast increases or decreases according to the pixel value of the neighbors [21]. When a pixel's value is less than its neighbors, its contrast decreases, while its contrast increases when its value is more than its neighbors. The method continues and each pixel is compared to its neighbors to increase or decrease its contrast. Figure 3 shows a set of X-rays of the two OAI and RCU datasets for all KL grading of osteoarthritis after improvement. It should be noted that the images in Figure 2 are the same as in Figure 3 after improvement.
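The averaging step described in this subsection can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a standard 3 × 3 box filter that includes the target pixel in the average (the text above describes a 16-pixel neighborhood), and the function name `mean_filter` and the toy image are hypothetical.

```python
def mean_filter(image, size=3):
    """Replace each pixel with the mean of its size x size neighborhood.

    Assumption: the window is clipped at the image borders and includes
    the target pixel itself (a plain box filter).
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            output[y][x] = sum(window) / len(window)
    return output


# Toy 3 x 3 "X-ray" with a single noisy pixel in the center.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
smoothed = mean_filter(noisy)
print(smoothed[1][1])  # the isolated spike is pulled toward its neighbors
```

For the CLAHE step, a library routine such as OpenCV's `cv2.createCLAHE` could be applied to the filtered image; the text does not state which implementation was used.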

FFNN with CNN Features
This section discusses the techniques and materials applied to analyze the X-rays of the two OAI and RCU datasets to detect the severity grade of osteoarthritis. Training a dataset using CNN models takes a long time and requires sophisticated, expensive computers, and despite this, it may not reach satisfactory accuracy [22]. Therefore, this technique was applied, which consists of two parts: CNN to extract features and FFNN to classify features quickly and accurately.

Extract Deep Features
Artificial intelligence techniques, particularly CNN models, have been applied in many fields to serve humanity, and medicine has received a large share of them. CNN is distinguished by its superior ability in health care, especially in analyzing and processing biomedical images, due to its exceptional ability to extract hidden data [23].
CNN comprises dozens of layers that extract all the data from X-rays of KOA, even hidden, that experts do not see. The essential layers that analyze images to extract their data are convolutional layers, pooling layers, and some auxiliary layers [24]. This study analyzed X-rays of KOA and extracted features using VGG-19 and ResNet-101 models through deep convolutional layers. Convolutional layers are one of the essential layers of CNN, and each layer has a particular task for analyzing and extracting X-ray features. Some layers extract color features, and some focus on extracting the edges of the ROI. Other layers increase the contrast of the crucial areas and there are layers to extract the geometric features, so each layer performs a specific task. In the end, all the features are integrated to produce features representative of each image. Convolutional layers depend on the filter f (t) size that wraps around the image x(t) to be processed, as in Equation (2).
$$W(t) = (x * f)(t) = \int x(a)\, f(t - a)\, da \qquad (2)$$
where W(t) refers to the output, f(t) refers to the filter, and x(t) refers to the input X-ray.
Convolutional layers produce millions of neurons, which requires computational complexity and long training times. CNN solves this challenge through pooling layers that reduce the number of neurons and weights through two methods, max and average pooling [25]. The max layers select a set of pixels, compare each pixel with the other, select the max value, and put it instead of the selected pixels, as in Equation (3) [25]. The average pooling layers select a group of pixels, calculate its average, and put it instead of the selected pixels, as in Equation (4) [26].
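$$z(i, j) = \max_{m, n = 1, \ldots, k} f\left[(i - 1)p + m,\ (j - 1)p + n\right] \qquad (3)$$
$$z(i, j) = \frac{1}{k^2} \sum_{m, n = 1}^{k} f\left[(i - 1)p + m,\ (j - 1)p + n\right] \qquad (4)$$
where m, n means the location in the matrix, p means the stride of the filter, f means the feature map being pooled, and k means the size of the pooling window. There are also auxiliary layers after convolutional layers, such as the ReLU layer, which further improves the output by passing positive values and suppressing negative values, as in Equation (5).
$$\mathrm{ReLU}(x) = \max(0, x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (5)$$
To avoid overfitting, the dropout layer is set to 50%, which randomly deactivates 50% of the neurons in each training iteration.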
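The ReLU and pooling operations described above can be illustrated with a small sketch. It is only a toy example under stated assumptions: the feature map, the 2 × 2 window, and the stride of 2 are chosen for illustration, and the helper names `relu` and `pool2d` are hypothetical rather than taken from VGG-19 or ResNet-101.

```python
def relu(feature_map):
    """Pass positive values and replace negative values with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def pool2d(feature_map, k=2, stride=2, mode="max"):
    """Max or average pooling over k x k windows moved by `stride`."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, height - k + 1, stride):
        row = []
        for j in range(0, width - k + 1, stride):
            window = [feature_map[i + m][j + n] for m in range(k) for n in range(k)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / (k * k))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map standing in for a convolutional layer output.
feature_map = [
    [1, -2, 3, 0],
    [4, 5, -6, 7],
    [-1, 0, 2, 2],
    [3, -4, 1, 1],
]
activated = relu(feature_map)
max_pooled = pool2d(activated, mode="max")
avg_pooled = pool2d(activated, mode="avg")
print(max_pooled)  # [[5, 7], [3, 2]]
print(avg_pooled)  # [[2.5, 2.5], [0.75, 1.5]]
```

Both pooling variants reduce the 4 × 4 map to 2 × 2, which is how the pooling layers reduce the number of neurons passed to later layers.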
The VGG-19 and ResNet-101 models produce features with a size of 9786 × 2048 and 1650 × 2048 for the OAI and RCU datasets, respectively. It is noted that the resulting features are high-dimensional, so these features were passed to the PCA method to delete the redundant and non-significant features and save the essential features with a size of 9786 × 465 and 1650 × 465 for the two datasets of OAI and RCU, respectively.
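The dimensionality reduction described above can be sketched as follows. This is a minimal illustration, assuming PCA computed through a singular value decomposition with NumPy; the array sizes are scaled-down stand-ins for the 9786 × 2048 and 9786 × 465 matrices, the random data is synthetic, and `pca_reduce` is a hypothetical helper rather than the authors' code.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their first n_components principal directions."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
deep_features = rng.normal(size=(200, 64))  # stand-in for 9786 x 2048
reduced = pca_reduce(deep_features, 15)     # stand-in for 9786 x 465
print(reduced.shape)  # (200, 15)
```

The text does not state how the number of retained components (465) was selected, so the sketch simply takes it as a parameter.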

FFNN Network
FFNN is a highly efficient neural network for solving classification tasks, including medical image processing. Classification tasks are solved through three basic layers. The input layer receives the features sent from the CNN models and contains 465 input units, one per feature of each image. The features pass through 15 hidden layers, in which the computations required for the task are performed. The output layer contains five neurons for each of the two datasets, one per KOA grade. The data passes through the network from the input layer in the forward direction, and the input to each neuron in the next layer is calculated from the outputs of the previous layer's neurons multiplied by their weights. Each time the weights are updated, the mean squared error (MSE) is calculated between the actual $x_i$ and expected $y_i$ values. The network continues until it reaches the stage of stability, where the weights do not change. Then, the network chooses the weights with the minimum MSE, as in Equation (6).
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2 \qquad (6)$$
where n means the number of features, $x_i$ means the actual output, and $y_i$ means the expected output [27]. Figure 4 illustrates the X-ray analysis methodology of the two OAI and RCU datasets for diagnosing KOA and discrimination of severity grade of the osteoarthritis by VGG-19-FFNN and ResNet-101-FFNN techniques.
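As a small worked example of Equation (6), the following sketch computes the MSE between a five-neuron network output and a one-hot target. The output values are invented for illustration, and encoding the expected grade as a one-hot vector is an assumption, since the text does not specify the target encoding.

```python
def mse(actual, expected):
    """Mean squared error between actual and expected output values."""
    n = len(actual)
    return sum((x - y) ** 2 for x, y in zip(actual, expected)) / n


# Five output neurons, one per KL grade (Grade 0 to Grade 4).
# Hypothetical network output for an image whose true grade is Grade 2.
network_output = [0.1, 0.1, 0.6, 0.1, 0.1]
one_hot_target = [0.0, 0.0, 1.0, 0.0, 0.0]

error = mse(network_output, one_hot_target)
print(round(error, 4))  # 0.04
```

During training, the weights that give the lowest value of this error on the training data would be the ones retained.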

FFNN with Fusion of CNN Features
This section discusses the techniques and materials applied for analyzing X-rays of the OAI and RCU datasets for detecting severity grade of KOA. Training a dataset using CNN models is time-consuming, sophisticated, and requires expensive computers and, despite this, may not reach satisfactory accuracy. So, this technique was applied, consisting of two parts: VGG-19 and ResNet-101 models for feature extraction and merging and FFNN for quick and accurate feature classification [28].
The methodology of this section consists of two systems based on combining the features of VGG-19 and ResNet-101 as follows. The first system extracts features from VGG-19 and ResNet-101 separately, then the features are merged and fed into the PCA to eliminate the repeated and unessential parts and keep the essential features. In the second system, features are extracted from VGG-19 and ResNet-101 separately; then, they are fed into the PCA separately to eliminate those that are redundant and unessential and keep the essential features. Finally, the essential features of the VGG-19 and ResNet-101 models are combined. Figure 5 illustrates the X-ray analysis methodology of the two OAI and RCU datasets for discriminating the severity of osteoarthritis by integrating features of VGG-19 and ResNet-101 before and after PCA. For the first system, the X-rays of the two OAI and RCU datasets for diagnosis of the severity of grade of KOA are analyzed in several steps as follows.
Firstly, the X-rays were improved, with better appearance of the knee joint through the average filter and CLAHE method. Secondly, the optimized X-rays were fed to the VGG-19 for analysis and extraction of the important and hidden features by convolutional layers, saving them at a size of 9786 × 2048 and 1650 × 2048 for the OAI and RCU datasets of osteoarthritis, respectively.
Thirdly, feeding the improved X-rays to ResNet-101 for analysis and extracting important and hidden features by convolutional layers and saving them at a size of 9786 × 2048 and 1650 × 2048 for the OAI and RCU datasets of osteoarthritis, respectively.
Fourthly, integrating the features of VGG-19 and ResNet-101 and saving them at a size of 9786 × 4096 and 1650 × 4096 for the OAI and RCU datasets of osteoarthritis, respectively.
Fifthly, feeding the merged features of size 9786 × 4096 and 1650 × 4096 to the PCA method to remove redundant and unnecessary features and keep the necessary features of size 9786 × 760 and 1650 × 760 for the OAI and RCU datasets of osteoarthritis, respectively.
Sixthly, feeding the essential features with a size of 9786 × 760 for the OAI dataset into FFNN for training and system performance testing.
Seventhly, feeding the essential features with a size of 1650 × 760 for the RCU dataset into FFNN for training and system performance testing.
For the second system, the X-rays of the two OAI and RCU datasets for diagnosis of the severity grade of KOA are analyzed in several steps as follows. The first three steps of the second system are the same as in the first system.
Fourthly, feeding the VGG-19 features into the PCA method to remove redundant and unnecessary features and keep the necessary features at a size of 9786 × 465 and 1650 × 465 for the OAI and RCU datasets of osteoarthritis, respectively.
Fifthly, feeding the ResNet-101 features into the PCA method to remove redundant and unnecessary features and retain the necessary features at a size of 9786 × 465 and 1650 × 465 for the OAI and RCU datasets of osteoarthritis, respectively.
Sixthly, integrating the reduced features of VGG-19 and ResNet-101 and saving them at a size of 9786 × 930 and 1650 × 930 for the OAI and RCU datasets of osteoarthritis, respectively.
Seventhly, feeding the essential features with a size of 9786 × 930 and 1650 × 930 for the OAI and RCU datasets of osteoarthritis, respectively, into FFNN for training and system performance testing.
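The difference between the two systems, merging before PCA versus merging after PCA, can be sketched as follows. This is an illustrative example under stated assumptions: the feature matrices are random stand-ins with reduced sizes, PCA is implemented through an SVD in NumPy, and `pca_reduce` is a hypothetical helper rather than the authors' implementation.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their first n_components principal directions."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
n_images = 100
vgg_features = rng.normal(size=(n_images, 32))     # stand-in for n x 2048
resnet_features = rng.normal(size=(n_images, 32))  # stand-in for n x 2048

# First system: merge (2048 + 2048 = 4096 columns), then reduce (to 760).
merged = np.hstack([vgg_features, resnet_features])
merged_then_reduced = pca_reduce(merged, 12)

# Second system: reduce each model's features (to 465), then merge (930).
reduced_then_merged = np.hstack(
    [pca_reduce(vgg_features, 8), pca_reduce(resnet_features, 8)]
)

print(merged.shape, merged_then_reduced.shape, reduced_then_merged.shape)
```

In the first system PCA sees correlations across both models' features, whereas in the second system each model's features are reduced independently, so the two feature sets generally differ even when their sizes are similar.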

FFNN Network with Hybrid Features of CNN and Handcrafted Features
This section discusses the techniques and materials applied for analyzing X-rays of the OAI and RCU datasets to detect the severity grade of KOA. Training the dataset using CNN models takes a long time, is complicated and computationally costly, and may not reach satisfactory accuracy. So, this technique, which consists of two parts, has been applied: the VGG-19 and ResNet-101 models extract features separately, and these features are then combined with the features of the GLCM, DWT, and LBP methods and classified by FFNN.
The methodology of this section consists of two systems that depend on the fusion features extracted in a way that combines CNN features with handcrafted features. Figure 6 shows the methodology of X-ray analysis of the two OAI and RCU datasets for diagnosing and discriminating the severity of osteoarthritis through fusion features of VGG-19 and handcrafted features, in addition to fusion features of ResNet-101 and handcrafted features.
The methodology of this technique for X-ray analysis of the two OAI and RCU datasets to diagnose the severity of osteoarthritis of the knee in several steps is as follows: First, the X-rays were enhanced, and the contrast of the knee joint was augmented by an average filter and the CLAHE method.
Second, the enhanced knee X-rays were fed to VGG-19 and ResNet-101 separately for analysis and minute and hidden features were extracted by convolutional layers; they were saved at a size of 9786 × 2048 and 1650 × 2048 for the OAI and RCU datasets of KOA, respectively.
Third, feeding features of the VGG-19 and ResNet-101 separately into the PCA method to remove redundant and unnecessary features and keep the necessary features at a size of 9786 × 465 and 1650 × 465 for the OAI and RCU datasets of osteoarthritis, respectively.
Fourth, extracting geometric and texture features through GLCM, DWT, and LBP methods and combining them, which are called handcrafted features [29].
Enhanced X-rays are fed to DWT to analyze them and extract geometric features. This method has four filters; therefore, the X-rays are divided into four parts for analysis, and each filter analyzes one part of the X-ray. The first part is passed to the Low-Low filter to analyze its approximate components and extract three statistical features. The second and third parts are passed to the Low-High and High-Low filters to analyze their detailed components and extract three statistical features from each part [30]. The fourth part is passed to the High-High filter to analyze its detailed components and extract three statistical features. Thus, the four filters produce 12 features of size 9786 × 12 and 1650 × 12 for the OAI and RCU datasets of osteoarthritis, respectively.
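The DWT feature extraction described above can be sketched as follows. This is a minimal illustration under several assumptions: a single-level Haar wavelet is used, the three statistics per sub-band are taken to be the mean, standard deviation, and variance (the text does not name them), and the helper names and toy image are hypothetical. Naming conventions for the LH and HL sub-bands also vary between libraries.

```python
from statistics import mean, pstdev, pvariance


def haar_dwt2(image):
    """Single-level 2-D Haar transform; returns the LL, LH, HL, HH sub-bands."""
    ll, lh, hl, hh = [], [], [], []
    for y in range(0, len(image) - 1, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for x in range(0, len(image[0]) - 1, 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            row_ll.append((a + b + c + d) / 2)  # approximation
            row_lh.append((a + b - c - d) / 2)  # detail across rows
            row_hl.append((a - b + c - d) / 2)  # detail across columns
            row_hh.append((a - b - c + d) / 2)  # diagonal detail
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def dwt_features(image):
    """Three statistics per sub-band, giving 4 x 3 = 12 features per image."""
    features = []
    for band in haar_dwt2(image):
        values = [v for row in band for v in row]
        features += [mean(values), pstdev(values), pvariance(values)]
    return features


# Toy 4 x 4 image made of flat 2 x 2 blocks, so all detail bands are zero.
image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
features = dwt_features(image)
print(len(features))  # 12
```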
Figure 6. Approaches of the X-ray analysis of the two OAI and RCU datasets for diagnosing osteoarthritis of the knee and discrimination of severity grade using FFNN with fusion features.
Enhanced X-rays are fed to the GLCM for analysis and extraction of the texture features of the knee joint. This method converts the X-rays into a grayscale matrix to extract features from the knee area. The method calculates the X-rays' spatial information based on the neighbors' distance and angles. The method decides whether an area is rough or smooth depending on the pixel and its neighbors [31]. If the adjacent pixels are close together, the region is smooth. In contrast, the region is rough if the pixels have different values. Thus, GLCM produces 13 features of size 9786 × 13 and 1650 × 13 for the OAI and RCU datasets of osteoarthritis, respectively.
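The smooth-versus-rough distinction that GLCM captures can be illustrated with a small sketch. It is a simplified example under stated assumptions: only one horizontal offset is used, only two texture measures (contrast and homogeneity) are computed rather than the 13 features mentioned above, and the function names and toy images are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            yy, xx = y + dy, x + dx
            if 0 <= yy < height and 0 <= xx < width:
                i, j = image[y][x], image[yy][xx]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Large when neighboring pixels differ strongly (rough texture)."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """Close to 1 when neighboring pixels are similar (smooth texture)."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
rough = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]

smooth_contrast = contrast(glcm(smooth, levels=4))
rough_contrast = contrast(glcm(rough, levels=4))
smooth_homogeneity = homogeneity(glcm(smooth, levels=4))
print(smooth_contrast, rough_contrast)  # 0.0 9.0
```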
Enhanced X-rays are fed into the LBP to analyze and extract features of the binary surface patterns. This method converts the image into a grayscale matrix for feature extraction. The method calculates the spatial information of the X-rays and processes each pixel together with its neighbors. In each iteration, the method takes a target pixel and its 24 adjacent pixels. The method compares the target pixel with its neighbors according to Equation (7) and replaces the target pixel with the resulting LBP value [32]. The method continues until all pixels have been processed and replaced. Thus, the LBP yields 203 features, giving feature matrices of size 9786 × 203 and 1650 × 203 for the OAI and RCU datasets of osteoarthritis, respectively.
where g_c denotes the center pixel, R denotes the radius of the neighborhood, g_p denotes the neighboring pixels, and P denotes the number of neighboring pixels.
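As an illustration of the symbols defined above, the following is a minimal sketch of the standard LBP operator, in which each neighbor g_p contributes 2^p to the code whenever g_p ≥ g_c. It is assumed here that Equation (7) corresponds to this standard definition. For brevity, the sketch uses P = 8 neighbors at radius R = 1 rather than the 24 neighbors used in this study, and the function names and toy patch are hypothetical.

```python
# Offsets of the 8 neighbors at radius 1, listed clockwise from the top-left pixel.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, r, c):
    """LBP code of the pixel at (r, c): sum of 2**p over neighbors with g_p >= g_c."""
    center = image[r][c]
    code = 0
    for p, (dr, dc) in enumerate(NEIGHBORS):
        if image[r + dr][c + dc] >= center:
            code += 1 << p
    return code

def lbp_image(image):
    """Replace every interior pixel with its LBP code (border pixels are skipped)."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]

def lbp_histogram(image, bins=256):
    """Histogram of LBP codes, which serves as the texture feature vector."""
    hist = [0] * bins
    for row in lbp_image(image):
        for code in row:
            hist[code] += 1
    return hist

# Toy 3 x 3 patch (hypothetical gray levels); the center pixel has value 6.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 8, 3]]
print(lbp_code(patch, 1, 1))  # neighbors 9, 7 and 8 are >= 6: 2 + 8 + 32 = 42
```

The histogram of the resulting codes, rather than the coded image itself, is what is typically used as a fixed-length feature vector.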
Fifth, the features of the three methods are merged and saved at a size of 9786 × 228 and 1650 × 228 for the OAI and RCU datasets of osteoarthritis, respectively. These are called handcrafted features.
Sixth, the features produced from VGG-19 are combined with the handcrafted features at a size of 9786 × 693 and 1650 × 693 for the OAI and RCU datasets of osteoarthritis, respectively.
Seventh, the features produced from ResNet-101 are combined with the handcrafted features at a size of 9786 × 693 and 1650 × 693 for the OAI and RCU datasets of osteoarthritis, respectively.
Eighth, the essential features with a size of 9786 × 693 and 1650 × 693 for the OAI and RCU datasets of osteoarthritis, respectively, are fed into FFNN for training and system performance testing.
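The merging in the fifth to seventh steps amounts to column-wise concatenation of per-image feature vectors, which the following minimal sketch illustrates. The widths of 13 (GLCM) and 203 (LBP) are taken from the text; the width of 12 for the third handcrafted method and the width of 465 for the reduced CNN features are not stated above and are inferred only from the arithmetic 228 − 13 − 203 and 693 − 228. The function name and the zero-filled placeholder data are hypothetical.

```python
def fuse(*feature_sets):
    """Concatenate per-image feature vectors from several extractors column-wise."""
    n_images = len(feature_sets[0])
    if any(len(fs) != n_images for fs in feature_sets):
        raise ValueError("all feature sets must describe the same images")
    return [sum((list(fs[i]) for fs in feature_sets), []) for i in range(n_images)]

N_IMAGES = 4  # placeholder; the datasets contain 9786 and 1650 X-rays

glcm_f = [[0.0] * 13 for _ in range(N_IMAGES)]
lbp_f = [[0.0] * 203 for _ in range(N_IMAGES)]
third_f = [[0.0] * 12 for _ in range(N_IMAGES)]   # inferred width: 228 - 13 - 203
cnn_f = [[0.0] * 465 for _ in range(N_IMAGES)]    # inferred width: 693 - 228

handcrafted = fuse(glcm_f, lbp_f, third_f)
fused = fuse(cnn_f, handcrafted)
print(len(handcrafted[0]), len(fused[0]))  # 228 693
```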

Split of OAI and RCU Datasets
This study aims to develop hybrid systems with high-efficiency hybrid features to distinguish the severity grade of KOA accurately. The proposed systems were evaluated on knee X-rays from the OAI and RCU datasets. The OAI and RCU datasets contain 9786 and 1650 X-rays, respectively, divided into five grades of KOA severity, as shown in Table 1. In all systems, 80% of each dataset was allocated to the training and validation phases and 20% to testing the performance of the proposed systems, as shown in Table 2.
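An 80/20 split of this kind can be sketched as follows. The sketch assumes that the split is stratified by grade so that each of the five grades keeps the same proportion in both parts; the text does not state whether the split was stratified, and the function name, seed, and toy labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), holding out test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Toy labels: 10 images for each of the five KL grades (hypothetical counts).
labels = [grade for grade in range(5) for _ in range(10)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 40 10
```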

Evaluating Systems
The performance of the systems was evaluated through the confusion matrix and the AUC produced during the test phase on the X-rays of the OAI and RCU datasets for diagnosing the severity of osteoarthritis. The confusion matrix records the test X-rays that were correctly classified (TN and TP) and those that were incorrectly classified (FN and FP) [33]. The performance of the systems was measured through the evaluation measures given in Equations (8)–(12).
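The following minimal sketch shows how accuracy, sensitivity, specificity, and precision can be derived from a multi-class confusion matrix in a one-vs-rest manner. It assumes the standard definitions of these measures in terms of TP, TN, FP, and FN; the AUC is not computed here. The function names and the 3 × 3 matrix are hypothetical and do not reproduce any result of this study.

```python
def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k; cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(cm[i][k] for i in range(n)) - tp
    tn = sum(map(sum, cm)) - tp - fn - fp
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

def overall_accuracy(cm):
    """Fraction of all test images that lie on the diagonal of the matrix."""
    return sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 1, 9]]
print(per_class_metrics(cm, 0))
print(overall_accuracy(cm))
```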

Balancing with Augmentation Data for the Two Datasets
CNN models need to be trained on a large dataset to reach good results and avoid overfitting. Many biomedical datasets contain too few images. Moreover, biomedical datasets are often class-imbalanced, which biases the accuracy toward the classes (disease types) that have more images. These challenges limit CNN models. They were overcome by applying data augmentation to the X-rays of the OAI and RCU datasets. The shortage of X-rays was addressed by artificially generating additional images from the original X-rays. Data augmentation includes many operations, such as rotating, flipping, shifting, and changing the height and width of the X-ray [34].
The class imbalance was also overcome by augmenting each class by a different amount. If all classes were increased equally, the dataset would remain unbalanced; therefore, each class (KL grade) was increased by a different amount to balance the two datasets. Table 3 describes the number of training X-rays in the OAI and RCU datasets before and after data augmentation was applied. Figure 7 shows the distribution of X-rays for the two datasets before and after applying the data augmentation method.
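The idea of augmenting each class by a different amount can be sketched as follows. The sketch assumes that every class is brought up to at least the size of the largest class; the class counts are hypothetical and are not those of Table 3, and the two image operations shown (horizontal flip and 90° rotation) are only simple examples of the operations listed above.

```python
import math

def augmentation_factors(class_counts, target=None):
    """Copies per image (original included) each class needs to reach the target size."""
    if target is None:
        target = max(class_counts.values())
    return {label: math.ceil(target / count) for label, count in class_counts.items()}

def hflip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

# Hypothetical training counts per KL grade (not the values reported in Table 3).
counts = {"grade 0": 1000, "grade 1": 500, "grade 2": 250, "grade 3": 400, "grade 4": 100}
factors = augmentation_factors(counts)
balanced = {label: counts[label] * factors[label] for label in counts}
print(factors)
print(balanced)
```

Because the smaller classes receive larger factors, the class sizes after augmentation are approximately equal, whereas a single common factor would preserve the original imbalance.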

Results of FFNN with CNN Features
This section summarizes the results of the systems for analyzing the X-rays of the OAI and RCU datasets for diagnosing the severity of osteoarthritis before it progresses to the severe stage. The VGG-19-FFNN and ResNet-101-FFNN techniques extract features with the CNN models and pass them to PCA to remove redundant features and retain the important ones. The important features are sent to FFNN, which splits the features of the two datasets to train the systems and test their performance.
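The PCA step between the CNN and the FFNN can be sketched as follows, assuming NumPy is available. The synthetic matrix stands in for CNN features and is constructed with deliberate redundancy (64 columns spanned by 5 independent directions) so that the effect of PCA is visible; the sizes, the function name, and the data are hypothetical, and the FFNN classifier itself is not shown.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in for CNN features: 200 images, 64 redundant columns.
latent = rng.normal(size=(200, 5))
cnn_features = latent @ rng.normal(size=(5, 64))

reduced, explained = pca_reduce(cnn_features, n_components=5)
print(reduced.shape, float(explained.sum()))
```

In this constructed example, five components retain essentially all of the variance, which is the sense in which redundant features are removed while the important ones are kept.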

Results of FFNN with Fusion of CNN Features
This section summarizes the results of hybrid systems with hybrid features for analyzing X-rays of the OAI and RCU datasets for diagnosing the severity of osteoarthritis before it progresses to the severe stage. Two systems were developed based on combining the features of VGG-19 and ResNet-101 before and after PCA. In the first system, the features of VGG-19 and ResNet-101 are extracted and merged into high-dimensional features, whose dimensionality is then reduced by PCA. In the second system, the features of VGG-19 and ResNet-101 are extracted and reduced separately by PCA, and the low-dimensional features are then merged. The important features are sent to FFNN, which splits the features of the two datasets to train the systems and test their performance.
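The difference between the two systems lies only in the order of merging and reduction, which the following minimal sketch makes explicit, assuming NumPy is available. The random matrices stand in for VGG-19 and ResNet-101 features, and all sizes, component counts, and function names are hypothetical.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their top principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def fuse_then_reduce(a, b, n_components):
    """First system: merge the high-dimensional features, then apply PCA once."""
    return pca_reduce(np.hstack([a, b]), n_components)

def reduce_then_fuse(a, b, n_components_each):
    """Second system: apply PCA to each feature set separately, then merge."""
    return np.hstack([pca_reduce(a, n_components_each),
                      pca_reduce(b, n_components_each)])

rng = np.random.default_rng(0)
vgg = rng.normal(size=(50, 40))     # stand-in for VGG-19 features
resnet = rng.normal(size=(50, 30))  # stand-in for ResNet-101 features

before = fuse_then_reduce(vgg, resnet, n_components=10)
after = reduce_then_fuse(vgg, resnet, n_components_each=5)
print(before.shape, after.shape)
```

Both variants produce feature matrices of the same width in this sketch, but the components differ: in the first variant they are chosen jointly across both networks, while in the second each network contributes a fixed share.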

Results of FFNN with Hybrid Features of CNN and Handcrafted Features
This section summarizes the results of hybrid systems with fusion features for X-ray image analysis of the OAI and RCU datasets to diagnose the severity of osteoarthritis before it progresses to the severe stage. Two systems were developed by combining the CNN features (VGG-19 and ResNet-101) separately with handcrafted features. In this technique, the features of VGG-19 and ResNet-101 are extracted separately and their high dimensionality is reduced by PCA before they are combined with the handcrafted features. The fusion features are sent to FFNN, which splits the features of the two datasets to train the systems and test their performance. Table 8

Discussion of the Performance of the Systems and Comparison of Results
Sheik et al. [35] trained an RCNN on X-ray images of knee patients to diagnose the knee joint, reaching an accuracy of 98.51%. Simon et al. [36] trained a ResNet network in PyTorch to determine the severity of knee inflammation, reaching an AUC of 92%. Jiangling et al. [37] used an aggregated multiscale dilated convolutional network for feature learning, combined with aggregated multiscale dilated CNN, and achieved an accuracy of 93.6%. Dilovan et al. [38] presented a deep learning model to extract features from X-rays of KOA. These features were then sent to the SVM, Naive Bayes, and KNN machine learning classifiers. KNN with deep learning features achieved better results than the other classifiers, reaching an accuracy of 90.01% and a specificity of 87.8%. Rabbia et al. [39] extracted hybrid features from the knee joint space using directed gradient graph, classified them with Random Forest, and achieved an accuracy of 97%. Ashish et al. [40] classified knee inflammation severity images based on adjusting the force parameters and classifying them by Decision Tree, achieving an accuracy of 91%.
Here, we review the results of the systems and compare their performance as follows.
Knee osteoarthritis is one of the most common diseases of the musculoskeletal system. It is a chronic disease that disrupts daily life and leads to disability, especially in the elderly [41]. The disease causes joint pain and knee weakness, and late diagnosis leads to joint replacement, which is very expensive [42]. KOA progresses through stages from grade 0 to grade 4, known as KL grading [43]. The initial KL grades are similar to one another. Therefore, manual diagnosis by doctors and experts may miss the subtle symptoms and characteristics that distinguish each grade from the others [44]. Deep learning techniques can extract subtle and hidden features that are not noticed in manual diagnosis [45]. In this study, three methodologies were developed; each methodology has two different systems for analyzing X-rays for KL grading of KOA.

The X-rays of the OAI and RCU datasets contain noise, and the ROI has low contrast. Thus, all X-rays were enhanced to improve accuracy in the subsequent stages of medical image processing. Data augmentation was applied to increase the number of images in the two datasets, which addresses both the overfitting problem facing CNN models and the dataset imbalance problem. In all methodologies, the OAI and RCU datasets were divided into 80% for the training and validation phases of the systems and 20% for testing the performance of the systems.
In the first methodology, the improved X-rays were input into VGG-19 [...]
In the third methodology, the improved X-rays of the OAI and RCU datasets are input into VGG-19 and ResNet-101 to extract subtle and hidden features separately. Handcrafted features from the GLCM, DWT, and LBP methods are extracted and combined. For the first system of the third methodology, the VGG-19 features are sent to PCA to delete the unimportant and redundant features and keep the essential ones, which are then combined with the handcrafted features. These are called the fusion features. FFNN receives the fusion features and classifies them with high accuracy. For the OAI dataset, FFNN achieved an accuracy of 99.1%, while for the RCU dataset, FFNN achieved an accuracy of 98.2%. For the second system of the third methodology, the ResNet-101 features are sent to PCA to delete the unimportant and redundant features and keep the essential ones, which are then combined with the handcrafted features to form the fusion features. FFNN receives the fusion features and classifies them with high accuracy. For the OAI dataset, FFNN achieved an accuracy of 99%, while for the RCU dataset, FFNN achieved an accuracy of 96.4%.
Table 10 summarizes the results achieved by the proposed systems for X-ray analysis of the OAI and RCU datasets of osteoarthritis, including the diagnostic accuracy of each system for each grade in the two datasets. First, for the OAI dataset, the best accuracy for the grade 0 and grade 2 classes, 99.5% and 99.6%, respectively, was achieved by FFNN with the fusion features of ResNet-101 and handcrafted features. The best accuracy for the grade 1 and grade 4 classes, 99.2% and 98.3%, respectively, was achieved by FFNN with the fusion features of VGG-19 and handcrafted features. The best accuracy for the grade 3 class, 98.8%, was achieved by FFNN with both the VGG-19-handcrafted and the ResNet-101-handcrafted fusion features.
It is noted that the results of the proposed systems exceed those reported in the previous related studies on the measures of accuracy, sensitivity, specificity, and AUC.

Conclusions
Osteoarthritis of the knee is a chronic disease that impedes movement, especially in the elderly. Therefore, early diagnosis of knee injury is necessary to avoid its development to the advanced stages, which require the replacement of knee joints. This study developed three methodologies for analyzing X-rays from the OAI and RCU datasets to diagnose osteoarthritis and discriminate between KL grades. The first methodology for diagnosing the degree of osteoarthritis uses two hybrid systems: VGG19-PCA-FFNN and ResNet101-PCA-FFNN. The second methodology for diagnosing the degree of osteoarthritis by FFNN is based on hybrid features of VGG-19 and ResNet-101 before and after PCA. The third methodology for diagnosing the degree of osteoarthritis by FFNN is based on the fusion features of CNN (VGG-19 and ResNet-101) and handcrafted features.
We conclude that the performance of FFNN with the fusion features of CNN models and handcrafted features was better than its performance with only CNN features or with combined CNN features.
For the OAI dataset with fusion features of VGG-19 and handcrafted features, FFNN reached an AUC of 99.25%, an accuracy of 99.1%, a sensitivity of 98.81%, a specificity of 100%, and a precision of 98.24%. For the RCU dataset with the fusion features of VGG-19 and handcrafted features, FFNN reached an AUC of 99.07%, an accuracy of 98.20%, a sensitivity of 98.16%, a specificity of 99.73%, and a precision of 98.08%.
Data Availability Statement: X-ray images supporting the performance of the systems were obtained from two publicly available online datasets at the following links: https://www.kaggle.com/datasets/tommyngx/kneeoa and https://www.kaggle.com/datasets/tommyngx/digital-knee-xray?select=MedicalExpert-I (accessed on 15 December 2022).